Split Selection Methods for Classification
Abstract
Classification trees based on exhaustive search algorithms tend to be biased towards selecting variables that afford more splits. As a result, such trees should be interpreted with caution. This article presents an algorithm called QUEST that has negligible bias. Its split selection strategy shares similarities with the FACT method, but it yields binary splits, and the final tree can be selected by a direct stopping rule or by pruning. Real and simulated data are used to compare QUEST with the exhaustive search approach. QUEST is shown to be substantially faster, and the size and classification accuracy of its trees are typically comparable to those of exhaustive search.
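The selection bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not QUEST or FACT; it is a minimal exhaustive search using Gini impurity decrease (an assumed criterion, chosen only for illustration), applied to data in which neither predictor is related to the class label. The function names `best_split_gain` and `selection_counts` are hypothetical. One predictor is binary (a single candidate split) and the other is continuous (up to n - 1 candidate splits).

```python
import random


def gini(counts):
    """Gini impurity of a node given its per-class counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts)


def best_split_gain(x, y):
    """Largest Gini decrease over every threshold split x <= c (exhaustive search)."""
    n = len(y)
    total = [y.count(0), y.count(1)]
    parent = gini(total)
    pairs = sorted(zip(x, y))
    left = [0, 0]
    best = 0.0
    for i in range(n - 1):
        left[pairs[i][1]] += 1
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # tied values: no valid cut point between them
        right = [total[0] - left[0], total[1] - left[1]]
        n_left = i + 1
        child = (n_left * gini(left) + (n - n_left) * gini(right)) / n
        best = max(best, parent - child)
    return best


def selection_counts(n_trials=200, n=60, seed=0):
    """Count how often each pure-noise predictor wins the exhaustive search."""
    rng = random.Random(seed)
    wins = {"binary": 0, "continuous": 0}
    for _ in range(n_trials):
        y = [rng.randint(0, 1) for _ in range(n)]      # labels, independent of x
        x_bin = [rng.randint(0, 1) for _ in range(n)]  # one candidate split
        x_cont = [rng.random() for _ in range(n)]      # up to n - 1 candidate splits
        if best_split_gain(x_cont, y) > best_split_gain(x_bin, y):
            wins["continuous"] += 1
        else:
            wins["binary"] += 1  # ties are credited to the binary predictor
    return wins


if __name__ == "__main__":
    print(selection_counts())
```

Because both predictors are noise, a selection rule without bias would pick each about half the time. In this setup the continuous predictor should win far more often, simply because the search maximizes over more candidate splits.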
Similar resources
An extensive comparison of recent classification tools applied to microarray data
Since most classification articles have applied a single technique to a single gene expression dataset, it is crucial to assess the performance of each method through a comprehensive comparative study. We evaluate by extensive comparison study extending Dudoit et al. (J. Amer. Statist. Assoc. 97 (2002) 77) the performance of recently developed classification methods in microarray experiment, and ...
Feature Selection and Dualities in Maximum Entropy Discrimination
Incorporating feature selection into a classification or regression method often carries a number of advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED...
Navigala: an Original Symbol Classifier Based on Navigation through a Galois Lattice
This paper deals with a supervised classification method, using Galois Lattices based on a navigation-based strategy. Coming from the field of data mining techniques, most literature on the subject using Galois lattices relies on selection-based strategies, which consists of selecting/choosing the concepts which encode the most relevant information from the huge amount of available data. Generall...
Extracting fuzzy classification rules with gene expression programming
In essence, data mining consists of extracting knowledge from data. This paper proposes an evolutionary system for discovering fuzzy classification rules. Fuzzy logic is useful for data mining especially in the case for performing classification task. Three methods were used to extract fuzzy classification rules using Evolutionary Algorithms: (1) genetic selection small number of large number of f...
Cloud Classification Using Error-Correcting Output Codes
Novel artificial intelligence methods are used to classify 16x16 pixel regions (obtained from Advanced Very High Resolution Radiometer (AVHRR) images) in terms of cloud type (e.g., stratus, cumulus, etc.). We previously reported that intelligent feature selection methods, combined with nearest neighbor classifiers, can dramatically improve classification accuracy on this task. Our subsequent analy...